Abstract

This document is the final report for AFOSR Grant F49620-01-1-0456, “Workstation Cluster for Simulations
of Quantum Lattice-Gas Automata and Entropic Lattice Boltzmann Models.” Under the terms
of  this  grant,  a  workstation  cluster  was  purchased  by  Professor  Bruce  Boghosian  of  the  Department  of 
Mathematics  at  Tufts  University  in  order  to  carry  out  large-scale  simulations  in  support  of  his  other 
AFOSR  project  F49620-01-1-0385,  “Quantum  Lattice-Gas  Automata  and  Hydrodynamics,”  which  was 
funded  by  AFOSR  for  three  years,  beginning  in  2001. 

This  report  is  restricted  to  a  description  of  the  computer  that  was  purchased,  and  to  the  process  of 
purchasing  it.  A  description  of  the  results  thus  obtained  will  be  included  in  the  final  report  of  AFOSR 
grant  F49620-01-1-0385. 
1  Description  of  Project

Professor  Bruce  M.  Boghosian  of  the  Department  of  Mathematics  at  Tufts  University  is  currently  receiving 
funding under AFOSR project F49620-01-1-0385, “Quantum Lattice-Gas Automata and Hydrodynamics,”
to  carry  out  large-scale  simulations  in  support  of  the  Quantum  Computation  effort  at  Air  Force  Research 
Laboratory  at  Hanscom  AFB.  This  project  has  been  funded  for  three  years  beginning  in  2001.  One  of  the 
principal  areas  of  study  is  quantum  lattice-gas  automata  as  a  paradigm  for  quantum  computation.  From 
the  outset,  it  became  clear  that  the  project  would  require  substantial  computational  facilities  in  order  to 
simulate  quantum  lattice-gas  automata. 

To understand the computational requirements involved, consider a quantum lattice-gas model for diffusion
in one spatial dimension. We first describe the corresponding classical model. We
suppose that we have a lattice of $N$ sites, with up to three particles per site, each of unit mass. The three
particles may be associated with negative velocity, zero velocity and positive velocity, respectively. In each
time step of the model, the particles stream in the direction of their velocity to the next site, and undergo
a collision. In order to simulate diffusion, the collisions should conserve mass, but not momentum. One
possibility, investigated many years ago in a paper by Boghosian and Taylor [1], is to have collisions with
probability $p$ only if there are exactly two particles at the site. In the event of such a collision, the result is
one of the other two two-particle states with even odds. This simple classical model was shown to have an
average density $\rho \in [0,3]$ that obeys the diffusion equation


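$$\frac{\partial \rho}{\partial t} = \frac{\partial}{\partial x}\left[ D(\rho)\, \frac{\partial \rho}{\partial x} \right],$$

with nonlinearity introduced by the density-dependent diffusivity $D(\rho)$.¹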
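The classical update rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the project: the periodic boundary conditions, the random initial fill, the lattice size, the seed, and the names `step` and `occ` are all assumptions made for the example. It streams the three velocity channels, applies the two-particle collision with probability $p$, and checks that total mass is conserved.

```python
import numpy as np

def step(occ, p, rng):
    """One time step: stream, then collide.

    occ has shape (3, N); rows hold occupation bits (0 or 1) for the
    negative-, zero- and positive-velocity particles at each site.
    """
    # Streaming on a periodic lattice (periodicity is an assumption).
    occ = np.stack([np.roll(occ[0], -1), occ[1], np.roll(occ[2], 1)])
    # Collisions happen only at sites holding exactly two particles,
    # and then only with probability p.
    two = occ.sum(axis=0) == 2
    collide = two & (rng.random(occ.shape[1]) < p)
    for x in np.flatnonzero(collide):
        hole = int(np.flatnonzero(occ[:, x] == 0)[0])  # the one empty channel
        # Move the hole to one of the other two channels with even odds,
        # which selects one of the other two two-particle states.
        new_hole = rng.choice([c for c in range(3) if c != hole])
        occ[:, x] = 1
        occ[new_hole, x] = 0
    return occ

rng = np.random.default_rng(0)
N = 64
occ = (rng.random((3, N)) < 0.5).astype(np.int8)
mass0 = int(occ.sum())
for _ in range(200):
    occ = step(occ, p=0.5, rng=rng)
mass1 = int(occ.sum())
print("mass before:", mass0, "mass after:", mass1)
```

Because a collision only moves the empty channel within a site, the particle count at every site is unchanged by it, while the momentum at that site generally is not.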


Now  consider  the  problem  of  quantizing  the  above-described  lattice-gas  automaton.  For  simplicity,  we 
eliminate  the  rest  particle,  and  allow  a  nontrivial  collision  only  when  there  is  one  particle  present  at  a  site. 
We  suppose  that  the  collision  takes  the  particle  to  the  other  one-particle  state  for  that  site,  or  leaves  it 
unchanged, with even odds. The system as a whole has $2N$ bits, and so it may be in any one of $2^{2N} = 4^N$
states. A quantum model will assign a complex amplitude to each of these states. The evolution of this
amplitude vector will be given by the action of a $4^N \times 4^N$ unitary matrix.

The unitary matrix described above has $16^N$ entries. Fortunately, most of them are zero, so the matrix is
sparse. A set of states with the same mass at each site constitutes an equivalence class. A collision will take
any state into another in the same equivalence class. The size of an equivalence class is thus $2^n$, where $n$ is
the number of one-particle sites on the lattice. The number of equivalence classes of size $2^n$ may be found by
noting that the $n$ one-particle sites may be placed in any of $\binom{N}{n}$ configurations, and the remaining $N-n$
sites may be either zero-particle or two-particle sites, for a total of $2^{N-n}$ possibilities. Thus, there are a
total of

$$\sum_{n=0}^{N} \binom{N}{n} 2^{N-n} = (1 + 2)^N = 3^N$$

equivalence  classes.  The  number  of  nonzero  elements  of  the  unitary  matrix  is  then 


$$\sum_{n=0}^{N} \binom{N}{n} 2^{N-n} \left(2^n\right)^2 = \sum_{n=0}^{N} \binom{N}{n} 2^{N+n} = 2^N \sum_{n=0}^{N} \binom{N}{n} 2^n = 6^N.$$

The  fraction  of  matrix  elements  that  are  nonzero  is  thus 


$$\frac{6^N}{16^N} = \left(\frac{3}{8}\right)^N.$$
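The counting argument can be checked by brute force for small lattices. The sketch below builds the collision operator as a Kronecker product of identical single-site operators and counts its nonzero entries. The particular 2×2 block used on the one-particle states is an assumption chosen only because it is unitary and gives even odds; the names `SITE`, `collision_matrix` and `counts` are likewise hypothetical. Streaming is a permutation of the basis states, so it does not change the number of nonzero entries and is omitted here.

```python
import numpy as np
from math import comb

# Single-site collision operator on the basis |00>, |01>, |10>, |11>.
# The block acting on the one-particle states |01>, |10> is an assumed
# choice: every entry has modulus 1/sqrt(2), i.e. even odds.
s = 1 / np.sqrt(2)
SITE = np.array([[1, 0,      0,      0],
                 [0, s,      1j * s, 0],
                 [0, 1j * s, s,      0],
                 [0, 0,      0,      1]], dtype=complex)

def collision_matrix(n_sites):
    """Dense 4^N x 4^N collision operator for n_sites independent sites."""
    U = np.array([[1.0 + 0j]])
    for _ in range(n_sites):
        U = np.kron(U, SITE)
    return U

def counts(n_sites):
    """Equivalence classes and nonzero entries from the sums in the text."""
    classes = sum(comb(n_sites, n) * 2 ** (n_sites - n)
                  for n in range(n_sites + 1))
    nonzeros = sum(comb(n_sites, n) * 2 ** (n_sites - n) * (2 ** n) ** 2
                   for n in range(n_sites + 1))
    return classes, nonzeros

for N in range(1, 6):
    U = collision_matrix(N)
    classes, nonzeros = counts(N)
    assert np.allclose(U @ U.conj().T, np.eye(4 ** N))   # unitarity
    assert classes == 3 ** N
    assert np.count_nonzero(U) == nonzeros == 6 ** N
    print(N, classes, nonzeros, nonzeros / 16 ** N)
```

The dense construction is only practical for a handful of sites, which is the point of the sparsity estimate: the matrix dimension grows as $4^N$ while the nonzero fraction shrinks as $(3/8)^N$.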


¹ There are minor corrections to the formula for the diffusivity, the origins of which are interesting and discussed in detail in
reference [1], but they are not essential to the present discussion.
The simulation of the evolution of this quantum lattice-gas automaton thus requires the ability to manipulate
large sparse matrices. A single time step of this automaton for a lattice of only twelve sites will require
the multiplication of complex vectors of length $4^{12} \approx 1.68 \times 10^7$ with unitary matrices of $6^{12} \approx 2.18 \times 10^9$
nonzero elements. The extension of these methods to higher dimensions, or to more complicated quantum
lattice gases for fluid dynamics, will thus require enormous amounts of computing ability, as well as analytic
and computational techniques more advanced than those alluded to in the above discussion.
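To put the twelve-site figures in terms of memory, the following sketch converts them to storage sizes. It assumes double-precision complex numbers at 16 bytes each and counts only the stored values, ignoring the index arrays that any sparse-matrix format would add; the function name `storage_estimate` is hypothetical.

```python
BYTES_PER_COMPLEX = 16  # assumption: two 8-byte floats per amplitude

def storage_estimate(n_sites):
    """Sizes of the amplitude vector and of the nonzero matrix values."""
    vector_len = 4 ** n_sites   # number of basis states
    nonzeros = 6 ** n_sites     # nonzero entries of the unitary matrix
    return {
        "vector_len": vector_len,
        "nonzeros": nonzeros,
        "vector_bytes": vector_len * BYTES_PER_COMPLEX,
        "matrix_value_bytes": nonzeros * BYTES_PER_COMPLEX,
    }

est = storage_estimate(12)
print(f"vector: {est['vector_len']:.3e} amplitudes, "
      f"{est['vector_bytes'] / 2**20:.0f} MiB")
print(f"matrix: {est['nonzeros']:.3e} nonzeros, "
      f"{est['matrix_value_bytes'] / 2**30:.1f} GiB (values only)")
```

Under these assumptions the amplitude vector occupies 256 MiB, while the nonzero matrix values alone come to roughly 32 GiB, so the matrix would have to be distributed across nodes or applied without being stored explicitly.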

Likewise, the classical simulation of entropic lattice-Boltzmann models has been shown [2] to require
huge  computational  resources.  Entropic  lattice-Boltzmann  models  hold  the  promise  of  guaranteed  stability 
for  arbitrarily  small  viscosity.  While  this  still  does  not  mitigate  the  turbulence  problem  for  Direct  Numerical 
Simulation  of  viscous  fluids,  since  the  lattice  size  would  still  need  to  be  large  enough  to  resolve  the  smallest 
eddies,  it  may  have  better  computational  scaling  than  other  alternatives.  It  may  also  introduce  a  natural 
kind  of  eddy  viscosity  when  the  smallest  eddies  are  not  resolved. 

For all of these reasons, AFOSR Grant F49620-01-1-0456 requested $34,942 for the purchase of a workstation
cluster for these kinds of simulations. The proposal specified a cluster of Apple G4 processors,
connected by a high-bandwidth Myrinet network. This was the first platform considered, and Apple loaned
one such processor to the PI during the summer of 2001 for testing purposes. It performed very well in our
tests using the Absoft Fortran compiler. Much of our motivation was based on the successes of Decyk and
his collaborators at UCLA [3]. At the same time, there was concern that the Macintosh platform was not
really designed for clustering, so that software amenities such as easy-to-use message-passing libraries, and
hardware amenities such as rack mounts, were not available.²

In  the  end,  we  decided  to  purchase  a  Linux  cluster  from  Microway,  Inc.  This  company,  located  in 
Plymouth, MA, specializes in the construction and assembly of Linux clusters. They deliver and install
them preloaded with software, and provide a two-year offsite hardware warranty and technical support for
users for the lifetime of the computer. The software purchased included the Portland Group suite of Fortran compilers,
along  with  the  MPI  message-passing  library. 

The  final  quote  received  from  Microway  was  dated  October  24,  2001.  After  experimenting  with  cost 
estimates for various configurations, we decided on a dual Athlon MP 1600+ (1.4 GHz) cluster in 2U rack
mounts, for a total of 18 CPUs. An accounting of the hardware is given in Fig. 1. The total cost of the
listed items was $41,234. Of this, $34,942 came from the AFOSR grant, and the remainder was
funded  by  Tufts  University. 
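The totals quoted here can be reproduced from the extended prices in the parts list. The short check below copies those figures into a dictionary (the key names are descriptive labels chosen for this example) and derives the overall cost, the portion covered by Tufts University, and the CPU count.

```python
# Extended totals in dollars, copied from the parts and price list.
ITEMS = {
    "master node": 5195,
    "compute nodes (8 at 3,125 each)": 8 * 3125,
    "flat panel monitor/keyboard/mouse": 2595,
    "KVM kit": 2300,
    "rack cabinet": 2695,
    "PGI Cluster Development Kit": 2879,
    "shipping": 570,
}
GRANT = 34942  # amount provided by the AFOSR grant

total = sum(ITEMS.values())
tufts_share = total - GRANT
cpus = 2 * (1 + 8)  # dual-CPU master node plus eight dual-CPU compute nodes

print("total cost: ", total)
print("Tufts share:", tufts_share)
print("CPUs:       ", cpus)
```

The sum agrees with the $41,234 stated above and leaves $6,292 as the remainder funded by Tufts University.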

The  machine  was  delivered  on  February  2,  2002,  and  Microway  returned  to  install  it  on  February  13,  2002. 
Tufts  University  provided  a  home  for  the  machine  in  their  main  machine  room  in  the  Tufts  Administration 
Building. This machine room houses all of the mainframes used on campus, is equipped with air
conditioning, UPS power sufficient to keep all machines running for an hour after an outage, and halon fire
protection, and is staffed 24/7. The extra funds and the machine room facility provided by Tufts University
were  a  de  facto,  if  not  de  jure,  form  of  cost  sharing  that  was  not  mentioned  in  the  proposal  itself. 

The  cluster  compiled  and  ran  a  Fortran  90  program,  with  MPI  communication,  shortly  after  it  was 
installed. It has been working reliably ever since. In the four months since it was installed, it has been taken
down only once, for a planned power outage. We are still pleased with our selection of the Athlon-based
Beowulf  cluster,  and  look  forward  to  several  years  of  good  use.  We  are  especially  pleased  that  the  installation 
(both  hardware  and  software)  provided  by  Microway  shielded  us  from  the  innumerable  details  involved  in 
getting  such  a  cluster  up  and  working,  and  allowed  us  to  remain  focused  on  our  research. 


2  Conclusions 

We  have  described  the  acquisition  of  a  Beowulf  computer  cluster  by  the  PI,  located  in  the  Department  of 
Mathematics.  The  cluster  will  be  used  in  support  of  AFOSR  grant  F49620-01-1-0385,  providing  mathematical 
and  technical  assistance  to  the  Quantum  Computation  group  at  Air  Force  Research  Laboratory  at  Hanscom 
AFB. The cluster was successfully purchased and installed, and is currently working well.

² Alas, for us and for them, Apple introduced rack mounts in the spring, after our purchase was made.
1. (QTY: 1) Dual Athlon MP 1600+ Master Node (4U)

(a) Tyan S2462UNG Dual Athlon MP 1600+ CPUs w/384K Cache per CPU with AMD-760 MP chipset and 200/266MHz
FSB

(b) 5 64/32-bit PCI slots, AGP Pro 50 Slot, 2 Serial/1 Parallel/2 USB

(c) 2GB DDR SDRAM (2 Each 1GB DDR Registered 266MHz DIMMs)

(d) 4U 19" RackServer Chassis with 24" slide rails

(e)  460  Watt  power  supply 

(f)  Onboard  Adaptec  AIC-7899W  Dual  Channel  Ultra3/160  SCSI  Controller 

(g)  2  Each  -  36GB  Ultra/160  10000  RPM  SCSI  HDD 

(h)  Sony  SDT9000/BM  12/24GB  Internal  4mm  DAT  DDS3  Tape  Backup 

(i)  Integrated  ATI  RAGE  XL  4MB  Video 

(j)  2  Each  3COM  3C996-T  64-bit  Gigabit  PCI  Ethernet  (TD825603) 

(k)  52X  CDROM  IDE  Drive 

(l) 3.5" Floppy Drive

(m)  Red  Hat  Linux  (CD)  7.1  with  MPICH  (Installed) 

Item  1  Unit  Price:  $5,195 
Item  1  Ext.  Total:  $5,195 

2.  (QTY:  8)  Dual  Athlon  MP  1600+  Compute  Node  (2U) 

(a) Tyan S2462UNG Dual Athlon MP 1600+ CPUs w/384K Cache per CPU with AMD-760 MP chipset and 200/266MHz
FSB

(b) 5 64/32-bit PCI slots, AGP Pro 50 Slot, 2 Serial/1 Parallel/2 USB

(c) 2GB DDR SDRAM (2 Each 1GB DDR Registered 266MHz DIMMs)

(d) 2U 19" Rackmount Chassis with riser card with 24" slide rails

(e)  460  Watt  power  supply 

(f)  Onboard  Adaptec  AIC-7899W  Dual  Channel  Ultra3/160  SCSI  Controller 

(g)  18GB  Ultra/160  10000  RPM  SCSI  HDD 

(h)  Integrated  ATI  RAGE  XL  4MB  Video 

(i)  3COM  3C996-T  64-bit  Gigabit  PCI  Ethernet  (TD825603) 

(j) 3.5" Floppy Drive

(k)  Red  Hat  Linux  7.1  with  MPICH  (Installed) 

Item  2  Unit  Price:  $3,125 
Item  2  Ext.  Total:  $25,000 

3. (QTY: 1) HP J1470A 15" RM Flat Panel Monitor/Kb/Mouse (TD#068181) (2U)

Item  3  Unit  Price:  $2,595 
Item  3  Ext.  Total:  $2,595 

4.  (QTY:  1  Lot)  Raritan  MCC16  16-channel  KVM  Kit  (2U)  (Consists  of:  1-MCC16,  1-RMCS16,  16-CCP20) 

Item 4 Unit Price: $2,300
Item  4  Ext.  Total:  $2,300 

5.  (QTY:  1  EA.)  40U  Microway  CoolRack  Cabinet  (Black)  (P/N  701936) 

(a) With 2 535 CFM 10" fans, 4 Each 11-outlet power strips

(b) With 1 sliding tray (for rackmountable monitor)

(c) Cabinet Dimensions: 77.58"H x 23.33"W x 40.59"D

Item  5  Unit  Price:  $2,695 
Item  5  Ext.  Total:  $2,695 

6.  (QTY:  1)  PGI  Cluster  Development  Kit  (CDK)  (2  Users/64  CPUs) 

Item 6 Unit Price: $2,879
Item  6  Ext.  Total:  $2,879 

7.  (QTY:  1  Lot)  Shipping  (to  loading  dock  at  Tufts  University) 

Item  7  Lot  Price:  $570 


Figure  1:  Parts  and  Price  List  for  Computer  Cluster 
A  Publications


Though  publications  will  come  from  the  research  performed  on  this  computer  cluster,  no  publications  were 
made  on  the  acquisition  of  the  cluster  itself.  The  publications  on  the  research  will  be  described  in  the  final 
report  of  AFOSR  grant  F49620-01-1-0385. 


B  Invited  Talks  and  Presentations 

While  the  PI,  Professor  Bruce  Boghosian,  gave  a  number  of  invited  talks  and  presentations  during  the  period 
of  this  proposal,  these  tended  to  be  on  the  research  conducted  with  the  computer,  and  not  on  the  acquisition 
of  the  computer  itself.  For  this  reason,  these  talks  and  presentations  will  be  described  in  the  final  report  of 
AFOSR  grant  F49620-01-1-0385. 

C  Honors  and  Awards 

During the period of this grant, Professor Bruce Boghosian received a Visiting Fellowship from the RealityGrid
project, based at Queen Mary College of the University of London. This Fellowship will enable him to
visit London during the summers of 2002, 2003 and 2004, to learn about and participate in the Grid Computing
projects in progress there. The RealityGrid effort is aimed at using Grid Computing for problems in Materials
Science. This connection may well be useful in determining future modes of operation of the machine
funded  by  this  grant;  that  is,  at  some  point,  we  may  wish  to  add  more  processors,  and  configure  it  as  part 
of  a  Grid  Computing  network. 